Elliptic Constructions: Spotting Patterns in UD Treebanks
Abstract
The goal of this paper is to survey the annotation of ellipsis in Universal Dependencies (UD) 2.0 treebanks. In the long term, knowing the types and frequencies of elliptical constructions is important for parsing experiments focused on ellipsis, which was also our original motivation. However, the current state of annotation is still far from perfect, so the main outcome of the present study is a description of errors and inconsistencies; we hope that it will help improve future releases.
Related resources
Gapping Constructions in Universal Dependencies v2
In this paper, we provide a detailed account of sentences with gapping such as “John likes tea, and Mary coffee” within the Universal Dependencies (UD) framework. We explain how common gapping constructions as well as rare complex constructions can be analyzed on the basis of examples in Dutch, English, Farsi, German, Hindi, Japanese, and Turkish. We further argue why the adopted analysis of th...
Should Have, Would Have, Could Have. Investigating Verb Group Representations for Parsing with Universal Dependencies
Treebanks have recently been released for a number of languages with the harmonized annotation created by the Universal Dependencies project. The representations of certain constructions in UD are known to be suboptimal for parsing and may be worth transforming for the purpose of parsing. In this paper, we focus on the representation of verb groups. Several studies have shown that parsing works ...
CoNLL ’17: UD Shared Task
This paper describes LIMSI’s submission to the CoNLL 2017 UD Shared Task, which is focused on small treebanks, and how to improve low-resourced parsing only by ad hoc combination of multiple views and resources. We present our approach for low-resourced parsing, together with a detailed analysis of the results for each test treebank. We also report extensive analysis experiments on model select...
From Universal Dependencies to Abstract Syntax
Abstract syntax is a tectogrammatical tree representation, which can be shared between languages. It is used for programming languages in compilers, and has been adapted to natural languages in GF (Grammatical Framework). Recent work has shown how GF trees can be converted to UD trees, making it possible to generate parallel synthetic treebanks for those 30 languages that are currently covered ...
Converting an English-Swedish Parallel Treebank to Universal Dependencies
The paper reports experiences of automatically converting the dependency analysis of the LinES English-Swedish parallel treebank to universal dependencies (UD). The most tangible result is a version of the treebank that actually employs the relations and parts-of-speech categories required by UD, and no other. It is also more complete in that punctuation marks have received dependencies, which ...
Publication date: 2017